Goal-conditioned RL agents
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Robots (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.67)
Instructing Goal-Conditioned Reinforcement Learning Agents with Temporal Logic Objectives
Goal-conditioned reinforcement learning (RL) is a powerful approach for learning general-purpose skills by reaching diverse goals. However, it has limitations when it comes to task-conditioned policies, where goals are specified by temporally extended instructions written in the Linear Temporal Logic (LTL) formal language. Existing approaches for finding LTL-satisfying policies rely on sampling a large set of LTL instructions during training to adapt to unseen tasks at inference time. However, these approaches do not guarantee generalization to out-of-distribution LTL objectives, which may have increased complexity. In this paper, we propose a novel approach to address this challenge. We show that simple goal-conditioned RL agents can be instructed to follow arbitrary LTL specifications without additional training over the LTL task space. Unlike existing approaches that focus on LTL specifications expressible as regular expressions, our technique is unrestricted and generalizes to ω-regular expressions. Experimental results demonstrate the effectiveness of our approach in adapting goal-conditioned RL agents to satisfy complex temporal logic task specifications zero-shot.
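The abstract does not reproduce the paper's algorithm; as a rough illustration of the general idea, the sketch below steers a frozen goal-conditioned policy with a hand-written automaton for the formula F(a) ∧ F(b) ("eventually reach a, and eventually reach b"). The automaton table, goal coordinates, `goal_conditioned_policy` stub, and `run_episode` loop are all hypothetical stand-ins; a real pipeline would translate the LTL formula into an automaton with a dedicated tool rather than writing it by hand.

```python
# Illustrative sketch (not the paper's implementation): steering a frozen
# goal-conditioned policy with an automaton for F(a) & F(b),
# i.e. "eventually reach region a, and eventually reach region b".
import numpy as np

# Automaton states: 0 = nothing reached, 1 = a reached, 2 = accepting.
# Each non-accepting state names the goal region to pursue next; reaching
# that region fires the transition.
AUTOMATON = {
    0: {"target": "a", "next": 1},
    1: {"target": "b", "next": 2},
}
GOALS = {"a": np.array([1.0, 0.0]), "b": np.array([0.0, 1.0])}  # hypothetical

def goal_conditioned_policy(state, goal):
    """Stand-in for a pretrained agent: a greedy step toward the goal."""
    step = goal - state
    return 0.1 * step / (np.linalg.norm(step) + 1e-8)

def run_episode(max_steps=500, eps=0.05):
    state, q = np.zeros(2), 0
    for _ in range(max_steps):
        if q not in AUTOMATON:                    # accepting state reached
            return True
        goal = GOALS[AUTOMATON[q]["target"]]
        state = state + goal_conditioned_policy(state, goal)
        if np.linalg.norm(state - goal) < eps:    # proposition now holds
            q = AUTOMATON[q]["next"]              # advance the automaton
    return q not in AUTOMATON

print("LTL objective satisfied:", run_episode())
```

The key design point the sketch tries to convey: the goal-conditioned policy is never retrained; only the automaton state decides which goal the policy is conditioned on at each moment.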
Learning Domain Invariant Representations in Goal-conditioned Block MDPs
Deep Reinforcement Learning (RL) is successful in solving many complex Markov Decision Process (MDP) problems. However, agents often face unanticipated environmental changes after deployment in the real world. These changes are often spurious and unrelated to the underlying problem, such as background shifts for visual input agents. Unfortunately, deep RL policies are usually sensitive to these changes and fail to act robustly against them. This resembles the problem of domain generalization in supervised learning. In this work, we study this problem for goal-conditioned RL agents. We propose a theoretical framework in the Block MDP setting that characterizes the generalizability of goal-conditioned policies to new environments. Under this framework, we develop a practical method, PA-SkewFit, that enhances domain generalization. The empirical evaluation shows that our goal-conditioned RL agent can perform well in various unseen test environments, improving by 50% over baselines.
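PA-SkewFit itself is not spelled out in the abstract; the hedged sketch below shows one common recipe with the same flavor of domain invariance in Block MDPs: encode observations and penalize the distance between encodings of the same underlying state rendered under different domains (backgrounds), so the representation ignores the spurious factor. The `render` function, network sizes, and training loop are toy assumptions, not the paper's method.

```python
# Illustrative sketch (not PA-SkewFit): align representations of the same
# latent state across two domains that differ only in a nuisance factor.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 8))
opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)

def render(latent, domain_shift):
    """Toy Block-MDP observation: latent state plus domain-specific nuisance."""
    nuisance = domain_shift * torch.ones(latent.shape[0], 8)
    return torch.cat([latent, nuisance], dim=-1)

for step in range(200):
    latent = torch.randn(32, 8)               # shared underlying state
    obs_a = render(latent, domain_shift=0.0)  # e.g. training background
    obs_b = render(latent, domain_shift=3.0)  # e.g. shifted background
    z_a, z_b = encoder(obs_a), encoder(obs_b)
    # Alignment term: same latent state -> same representation across domains.
    align_loss = ((z_a - z_b) ** 2).mean()
    # A real agent would add its RL objective here; only alignment is shown.
    opt.zero_grad()
    align_loss.backward()
    opt.step()
```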
RL for Planning and Planning for RL
[Figure: (a) Goal-conditioned RL often fails to reach distant goals, but can successfully reach the goal if starting nearby (inside the green region).]

Reinforcement learning (RL) has seen a lot of progress over the past few years, tackling increasingly complex tasks. Much of this progress has been enabled by combining existing RL algorithms with powerful function approximators (i.e., neural networks). Function approximators have enabled researchers to apply RL to tasks with high-dimensional inputs without hand-crafting representations, distance metrics, or low-level controllers. However, function approximators have not come for free, and their cost is reflected in notoriously challenging optimization: deep RL algorithms are famously unstable and sensitive to hyperparameters.
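As a hedged sketch of the waypoint idea in the figure caption (not the post's exact method): if the policy is only reliable over short distances, one can plan a chain of short hops with a shortest-path search over candidate states and hand the waypoints to the goal-conditioned policy one at a time. Below, `estimated_distance`, the candidate states, and the edge threshold are illustrative assumptions; a learned system would typically derive the distance estimate from a value function.

```python
# Illustrative sketch: Dijkstra over candidate states, keeping only edges the
# goal-conditioned policy could plausibly traverse (short estimated distance).
import heapq
import numpy as np

def estimated_distance(s, g):
    """Stand-in for a learned distance (e.g. derived from a value function)."""
    return float(np.linalg.norm(np.asarray(s) - np.asarray(g)))

def plan_waypoints(start, goal, candidates, max_edge=1.5):
    """Shortest path from start to goal using only short hops."""
    nodes = [tuple(start)] + [tuple(c) for c in candidates] + [tuple(goal)]
    dist = {n: float("inf") for n in nodes}
    prev = {}
    dist[tuple(start)] = 0.0
    heap = [(0.0, tuple(start))]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        for v in nodes:
            w = estimated_distance(u, v)
            if v != u and w <= max_edge and d + w < dist[v]:
                dist[v], prev[v] = d + w, u
                heapq.heappush(heap, (d + w, v))
    # Reconstruct start -> goal path; only short edges are ever used.
    path, node = [], tuple(goal)
    while node != tuple(start):
        path.append(node)
        node = prev[node]            # KeyError here means goal is unreachable
    return list(reversed(path))

waypoints = plan_waypoints(
    start=(0.0, 0.0), goal=(4.0, 0.0),
    candidates=[(1.0, 0.0), (2.0, 0.0), (3.0, 0.0)],
)
print(waypoints)   # a chain of nearby subgoals ending at the distant goal
```

Each returned waypoint lies within the policy's reliable range of the previous one, so the distant goal is reduced to a sequence of short-horizon goal-reaching problems.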